
Ct 854/change overzealous connection closing #428

Merged Feb 7, 2023 (18 commits)

Conversation

@VersusFacit (Contributor) commented Jan 31, 2023

resolves #201
closes #203

Description

Each time dbt talks to a Snowflake warehouse, connections are established and re-established repeatedly, and a lot of time is wasted on that step. Why not keep the already-open connection(s) alive for the duration of the execution? That's what this PR does.

Teasing out the existing behavior

Set threads to 1 in your profiles.yml, without the reuse_connections param, and watch entries arrive in:

select *
from table(information_schema.login_history_by_user())
order by event_timestamp desc;

New behavior

Just add the reuse_connections parameter to your Snowflake connection in profiles.yml, and a dbt run over Jaffle Shop goes from 7 unique login entries in Snowflake to 1. The time saved is upwards of 2/3. Pretty big deal!
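
A minimal profiles.yml sketch of what that looks like. The account, user, and credential values are placeholders, and the parameter name reuse_connections is taken from the discussion above:

jaffle_shop:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: my_account      # placeholder
      user: my_user            # placeholder
      password: my_password    # placeholder; use your usual auth method
      database: analytics
      warehouse: transforming
      schema: dbt_dev
      threads: 4
      reuse_connections: True  # the new flag this PR introduces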

Why this?

@joshuataylor (❤️) gave us a proof-of-concept fix in https://github.com/dbt-labs/dbt-snowflake/pull/203. I simplified the code down and added lines so users can plug and play with profiles.yml. I also added unit testing for this.

I was worried about thread safety, since self.lock is a bit of obtuse automagic. Eventually I realized that instead of reinventing the wheel, I could just super() up to SQLConnectionManager from dbt-core and its many safety checks. Otherwise, we're kind of at the mercy of largely undocumented Snowflake connector methods.
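
A minimal sketch of that shape, not the exact merged diff: the reuse_connections attribute on the credentials object is an assumption based on the profile parameter discussed above, and dbt-core's SQLConnectionManager supplies the locking and bookkeeping:

from dbt.adapters.sql import SQLConnectionManager


class SnowflakeConnectionManager(SQLConnectionManager):
    def release(self) -> None:
        # Sketch: when the profile opts into connection reuse, skip the
        # per-node close and let dbt-core's cleanup_all() tear everything
        # down at the end of the invocation. Otherwise defer to the base
        # class, which already guards its bookkeeping with self.lock.
        if getattr(self.profile.credentials, "reuse_connections", False):
            return
        super().release()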

A bug

Jeremy warns of this here. I ran into something like it once without looking for it: every model took under 2 seconds, but the last thread hung open for 8. Trials triggered this only with client_session_keep_alive set to True and reuse_connections set to False; no other combination reproduced it.

We could try to manually close connections, but setting an explicit timeout seems sketchy to me in the opposite direction.

I thought about closing any unused threads, but how would you do something like that in this Python runtime? At what point would it start sweeping over open threads and deciding which to close prematurely? If anyone has an idea, by all means, I'm super game; I'm just unsure how.

Things I'm uncomfy with

Technically, with this solution the connection is never closed; this shows up in the "bug" above. It's just cleaned up when the process exits. Is that acceptable in cases besides the bug above?

Update: We close all thread connections in the adapter context today, before tearing down the dbt runtime, so this has been addressed.
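
For context, a self-contained sketch of that adapter-context shape. The names here are illustrative, not dbt-core's exact API; the point is simply that every thread's connection is closed on the way out of the run, whatever happens inside it:

from contextlib import contextmanager


@contextmanager
def adapter_management(pool):
    # Illustrative: whatever happens during the run, close every
    # thread's connection on exit. dbt-core's cleanup_all() plays
    # this role for the real adapter.
    try:
        yield pool
    finally:
        for conn in pool:
            conn.close()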

Did you document this?

That I did. Corresponding docs PR up over yonder.


@github-actions commented

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the dbt-snowflake contributing guide.
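
For reference, dbt adapter repos manage changelog entries with changie; a hypothetical entry for this change might look like the following. The exact fields are defined by the repo's .changie.yaml, so treat this as a sketch, not the actual entry:

kind: Features
body: Add reuse_connections profile parameter to keep Snowflake connections
  open across nodes instead of reconnecting per node
time: 2023-01-31T00:00:00.000000-00:00
custom:
  Author: VersusFacit
  Issue: "201"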

@@ -151,6 +152,18 @@ def auth_args(self):
result["client_store_temporary_credential"] = True
# enable mfa token cache for linux
result["client_request_mfa_token"] = True

# Warning: This profile configuration can result in specific threads, even just one,
@VersusFacit (Contributor, Author) commented Jan 31, 2023

Context/Alternatives in PR description.

This is my proposed way of handling what appears to be a rare bug: warn people against it, don't encourage it.

@VersusFacit (Contributor, Author) commented Jan 31, 2023

Random thought: going to check whether sessions hang open in the console, if able, to get even more info on that "bug". Still, the questions above on how we want to proceed apply. (The console lists GUI sign-ins, not CI ones, so no help there.)

@nssalian self-requested a review Feb 1, 2023
@Fleid added labels pr_tracked and ready_for_review (Externally contributed PR has functional approval, ready for code review from Core engineering) Feb 1, 2023
@McKnight-42 (Contributor) left a comment

Oh, great pick-up! Wouldn't have wanted this one to disappear. All seems okay to me!!

@joshuataylor (Contributor) commented

🎉!!! This is great, what a fantastic solution.

Could the issue with connections hanging be related to latency? That's my gut feeling here: something funky with timeouts, or Snowflake having intermittent issues.

I'll see if I can run tracing with this and just keep repeating the runs until it happens. I'm ~250ms away, so we'll see how often it happens.

To test hanging connections, I'll try the following:

  1. Create a new user in Snowflake
  2. Set STATEMENT_TIMEOUT_IN_SECONDS for this user to a low value, like 15-30 seconds, so it doesn't hold everything else up (see the SQL sketch after this list).
  3. Set threads: 4 in profiles.yml; if that doesn't trigger it, try 2; if that doesn't work... 8.
  4. Have a few small test models that will keep being rebuilt
  5. dbt run 🏃

If this works, I'll try larger models to see whether those trigger it. I suspect there's a correlation with either shorter or longer runs. Or maybe it's just random 🤷.
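
For step 2 above, the timeout can be set with a statement along these lines (dbt_test_user is a placeholder name):

alter user dbt_test_user set statement_timeout_in_seconds = 30;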

@mikealfare (Contributor) left a comment

I wonder if this is related to the issue that we're seeing in local unit tests, and in the scenario where we don't specify the number of workers for unit tests.

@VersusFacit (Contributor, Author) commented Feb 6, 2023

@joshuataylor After digging into this more, my only real concern was this behavior resulting in financial cost to users without their knowing -- and that appears NOT to be the case, per both the docs and anecdotes from users.

I believe the bug I ran into -- where execution on one thread hung open for an unusual amount of time -- was related to intermittent Snowflake bottlenecks rather than anything caused by leaving open threads in the pool.

@VersusFacit (Contributor, Author) commented Feb 7, 2023

Revised the code to meet some design specs, and confirmed the following after talking to Snowflake folks:

  1. Threads must be cleaned up before Python teardown, because the modules the connector package depends on are torn down in nondeterministic order (huh!).
  2. Buuuut an atexit handler for connections isn't needed:
  • I ran two dozen trials over a dozen-plus models with all manner of thread and flag combinations -- never once did the connections hang; the cleanup_all function we indirectly call through core does the heavy lifting here.
  • Also, even if this cleanup weren't present, leaving threads open would only be problematic if one were using undocumented behaviors (thanks, Snowflake team, for that insight!).
  3. client_session_keep_alive has no visible conflicts! So I removed the warning code.

This one was an adventure; thank you, everyone, for your patience and insight in getting us to these answers!

@nssalian (Contributor) left a comment

lgtm. Thanks for digging in and investigating this to understand + confirm the underlying behavior.

@McKnight-42 (Contributor) left a comment

LGTM, great catch on the flip!

Labels: cla:yes, ready_for_review (Externally contributed PR has functional approval, ready for code review from Core engineering)
Development

Successfully merging this pull request may close: [CT-854] Cache authentication (token) for subsequent connections

6 participants